Dutch Sublanguage Semantic Tagging combined with Mark-Up Technology

Authors

  • Peter Spyns
  • Ngo Thanh Nhan
  • Erik Baert
  • Naomi Sager
  • Georges De Moor
Abstract

This paper shows how the morphological component of an existing NLP system for Dutch, the Dutch Medical Language Processor (DMLP), has been extended to produce output that is compatible with the language-independent modules of the LSP-MLP system (Linguistic String Project Medical Language Processor) of New York University. The former can thus take advantage of the language-independent developments of the latter while focusing on the idiosyncrasies of Dutch. This general strategy is illustrated by a practical application: highlighting relevant information in a patient discharge summary (PDS) by means of HyperText Mark-Up Language (HTML) technology. Such an application can be useful for medical administrative purposes in a hospital environment.
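The highlighting idea can be illustrated with a minimal sketch. It assumes the tagger delivers a list of (word, semantic class) pairs; the class names, colours, the function `to_html` and the example sentence are all hypothetical and are not taken from DMLP or LSP-MLP.

```python
from html import escape

# Hypothetical mapping from semantic class to a highlight colour.
# The class names are illustrative only, not the DMLP or LSP-MLP tag set.
HIGHLIGHT = {
    "SYMPTOM": "#ffd54f",
    "BODYPART": "#81d4fa",
}


def to_html(tagged_tokens):
    """Render (word, semantic_class_or_None) pairs as an HTML fragment.

    Tokens whose class appears in HIGHLIGHT are wrapped in a <span>;
    every other token is emitted as escaped plain text.
    """
    parts = []
    for word, sem_class in tagged_tokens:
        text = escape(word)  # protect against markup characters in the input
        colour = HIGHLIGHT.get(sem_class)
        if colour is None:
            parts.append(text)
        else:
            parts.append(
                f'<span class="{sem_class}" style="background:{colour}">{text}</span>'
            )
    return " ".join(parts)


# Invented Dutch example ("patient has pain in the chest"), not from the paper.
sentence = [
    ("patiënt", None),
    ("heeft", None),
    ("pijn", "SYMPTOM"),
    ("in", None),
    ("de", None),
    ("borst", "BODYPART"),
]
print(to_html(sentence))
```

Keeping the semantic class as the `class` attribute would let a stylesheet, rather than the generator, decide which categories are shown for a given administrative task.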

Similar resources

Interacting Semantic Layers of Annotation in SoNaR, a Reference Corpus of Contemporary Written Dutch

This paper reports on the annotation of a corpus of 1 million words with four semantic annotation layers, including named entities, coreference relations, semantic roles and spatial and temporal expressions. These semantic annotation layers can benefit from the manually verified part of speech tagging, lemmatization and syntactic analysis (dependency tree) information layers which resulted from...


From D-Coi to SoNaR: a reference corpus for Dutch

The computational linguistics community in The Netherlands and Belgium has long recognized the dire need for a major reference corpus of written Dutch. In part to answer this need, the STEVIN programme was established. To pave the way for the effective building of a 500-million-word reference corpus of written Dutch, a pilot project was established. The Dutch Corpus Initiative project or D-Coi ...


Combining Independent Knowledge Sources for Word Sense Disambiguation

Yorick Wilks and Mark Stevenson, Department of Computer Science, University of Sheffield, Regent Court, 211 Portobello Street, Sheffield S1 4DP, UK. Abstract: Sense tagging, the automatic assignment of the appropriate sense from some lexicon to each of the words in a text, is a specialised instance of the general problem of word sense disambiguation. We di...


Information extraction from non-segmented text (on the material of weather forecast telegrams)

A domain-specific and sublanguage-specific approach to text analysis and information extraction is proposed. The texts under consideration are weather forecast telegrams written in Russian. Telegrams are an example of a deviant text type: they lack text segmentation markers and contain many abbreviations as well as syntactic and spelling mistakes. The presented work addresses the problem of text segmentation: a procedur...


Cornetto: A Combinatorial Lexical Semantic Database for Dutch

One of the goals of the STEVIN programme is the realisation of a digital infrastructure that will enforce the position of the Dutch language in the modern information and communication technology. A semantic database for Dutch is a crucial component for this infrastructure for three reasons: (1) it enables the development of semantic web applications on top of knowledge and information expresse...


Publication date: 1997